[SPARK-55025][PS] Improve performance in pandas by using list comprehension #53701

devin-petersohn · 2026-01-06T20:48:49Z

What changes were proposed in this pull request?

Improve the performance of various metadata and precomputation operations in the pandas API on Spark by replacing explicit for loops with list comprehensions.
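As a rough illustration of the kind of rewrite described above (this is not code from the patch; the function names and the sample column labels are hypothetical), a nested loop that builds a list with repeated `append` calls can usually be expressed as a single list comprehension that produces the same result:

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for multi-level column labels; not taken from frame.py.
Label = Tuple[str, ...]


def flatten_with_loops(labels: Sequence[Label]) -> List[str]:
    """Collect every level name using nested for loops and append()."""
    flat: List[str] = []
    for label in labels:
        for name in label:
            flat.append(name)
    return flat


def flatten_with_comprehension(labels: Sequence[Label]) -> List[str]:
    """Same result, built by one list comprehension (outer loop first)."""
    return [name for label in labels for name in label]


if __name__ == "__main__":
    sample = [("a", "x"), ("a", "y"), ("b",)]
    # Both versions must agree on order and content.
    assert flatten_with_loops(sample) == flatten_with_comprehension(sample)
    print(flatten_with_comprehension(sample))
```

The comprehension avoids the repeated attribute lookup and call to `append` on each iteration, which is where any speedup would likely come from; how much it matters for a given call site would need to be measured.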

Why are the changes needed?

Performance and maintainability

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI

Was this patch authored or co-authored using generative AI tooling?

No

github-actions · 2026-01-06T20:48:57Z

JIRA Issue Information

=== Bug SPARK-55025 ===
Summary: Pyspark pandas use of nested for loops
Assignee: None
Status: Open
Affected: ["4.1.1"]

This comment was automatically generated by GitHub Actions

holdenk · 2026-01-12T21:13:49Z

@devin-petersohn we should make a new JIRA for this since the old one is already resolved.

devin-petersohn · 2026-01-13T16:27:59Z

Sorry about that! Fixed the title and created a new issue.

HyukjinKwon

I guess this is good to go? wdyt @gaogaotiantian @holdenk

huaxingao

LGTM

huaxingao · 2026-01-17T03:44:12Z

Merged to master! Thanks @devin-petersohn for the PR! Thanks @holdenk @HyukjinKwon @gaogaotiantian for the review!

[SPARK-54787][PS] Improve performance in pandas by using list compreh…

7f11568

…ension Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com> Co-authored-by: Devin Petersohn <devin.petersohn@snowflake.com>

github-actions bot added the PYTHON and PANDAS API ON SPARK labels Jan 6, 2026

gaogaotiantian reviewed Jan 6, 2026

python/pyspark/pandas/frame.py

devin-petersohn added 4 commits January 7, 2026 09:36

Lint

cd7ebce

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>

Merge remote-tracking branch 'upstream/master' into devin/pandas_main…

e74fe2b

…tain_03

Merge remote-tracking branch 'upstream/master' into devin/pandas_main…

f49eaae

…tain_03

Make linter happy

99cd95d

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>

devin-petersohn changed the title ~~[SPARK-54787][PS] Improve performance in pandas by using list comprehension~~ [SPARK-55025][PS] Improve performance in pandas by using list comprehension Jan 13, 2026

HyukjinKwon approved these changes Jan 13, 2026

gaogaotiantian approved these changes Jan 14, 2026

huaxingao approved these changes Jan 17, 2026

asf-gitbox-commits closed this in 69f1d5c Jan 17, 2026
